An improved algorithm for feature selection using fractal dimension

Authors

  • Haiqin Zhang
  • Chang-Shing Perng
  • Qingsheng Cai
  • Thomas J. Watson
Abstract

Dimensionality reduction is an important issue in data mining and machine learning. Traina [1] proposed a feature selection algorithm that selects the most important attributes of a given set of n-dimensional vectors based on the correlation fractal dimension. The author used a kind of multi-dimensional “quad-tree” structure to compute the fractal dimension. Inspired by this work, we propose a new and simpler algorithm to compute the fractal dimension, and design a novel and faster feature selection algorithm using the correlation fractal dimension, whose time complexity is lower than that of Traina’s. The main idea is that, when computing the fractal dimension of (d-1)-dimensional data, the intermediate results generated for the extended d-dimensional data are reused. The algorithm inherits the desirable properties described in [1]. In addition, our algorithm does not require the grid sizes to decrease by half, as the original “quad-tree” algorithm does. Experiments show that our feature selection algorithm is efficient on the test dataset.
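The general idea behind fractal-dimension-based feature selection can be illustrated with a minimal sketch. It is not the authors' algorithm: it recomputes every projection from scratch and so does not reuse the intermediate d-dimensional results described in the abstract. It assumes a plain box-counting estimate of the correlation fractal dimension (the slope of log of the sum of squared cell occupancies against log of the cell side) and a backward-elimination loop. The function names `correlation_fractal_dimension` and `rank_attributes`, and the choice of radii, are hypothetical.

```python
import numpy as np


def correlation_fractal_dimension(points, radii):
    """Estimate the correlation fractal dimension D2 by box counting.

    For each cell side r, the data (rescaled to the unit cube) is put on a
    grid and S(r) = sum of squared cell occupancies is computed. D2 is the
    least-squares slope of log S(r) against log r.
    """
    points = np.asarray(points, dtype=float)
    mins = points.min(axis=0)
    spans = points.max(axis=0) - mins
    spans[spans == 0] = 1.0  # constant attributes: avoid division by zero
    unit = (points - mins) / spans

    log_r, log_s = [], []
    for r in radii:
        cells = np.floor(unit / r).astype(np.int64)  # grid cell of each point
        _, counts = np.unique(cells, axis=0, return_counts=True)
        log_r.append(np.log(r))
        log_s.append(np.log(np.sum(counts.astype(float) ** 2)))

    slope, _ = np.polyfit(log_r, log_s, 1)
    return slope


def rank_attributes(points, radii):
    """Backward elimination (assumed scheme): repeatedly drop the attribute
    whose removal leaves the highest estimated D2, i.e. the attribute that
    contributes least. Returns attribute indices, least important first."""
    points = np.asarray(points, dtype=float)
    remaining = list(range(points.shape[1]))
    dropped = []
    while len(remaining) > 1:
        def dimension_without(attr):
            keep = [c for c in remaining if c != attr]
            return correlation_fractal_dimension(points[:, keep], radii)

        least_useful = max(remaining, key=dimension_without)
        remaining.remove(least_useful)
        dropped.append(least_useful)
    return dropped + remaining
```

In this sketch the radii need not halve from one step to the next, since the grid is rebuilt for each value; any decreasing sequence of cell sides in (0, 1] can be passed. Each elimination step re-scans the data once per candidate attribute, which is the cost that reusing intermediate results is meant to avoid.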

Similar resources

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...

An Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification

In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...

Presenting a brain-signal labeling method for classifying different states of anesthesia

Aims and background: This study develops a computational framework for the classification of different anesthesia states, including awake, moderate anesthesia, and general anesthesia, using electroencephalography (EEG) signals and peripheral parameters. Materials and Methods: The proposed method proposes ...

Publication date: 2002